Mixed Language Query Disambiguation
Authors
Abstract
We propose a mixed language query disambiguation approach that uses co-occurrence information from monolingual data only. A mixed language query consists of words in a primary language and a secondary language. Our method translates the query into monolingual queries in either language. Two novel features for disambiguation, namely contextual word voting and 1-best contextual word, are introduced and compared to a baseline feature, the nearest neighbor. Average query translation accuracies for the two features are 81.37% and 83.72%, compared to the baseline accuracy ...
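The abstract names the three features but does not define them, so the following is only a minimal sketch of how co-occurrence-based selection of this kind might work. Everything in it is an assumption made for illustration: the toy counts in `COOC`, the use of raw co-occurrence counts as the score, the convention that `context` is ordered nearest-first, and the function names themselves. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical co-occurrence counts taken from a monolingual corpus in the
# primary language. Keys are (translation candidate, context word).
COOC = {
    ("bank", "money"): 40,
    ("bank", "account"): 35,
    ("bank", "river"): 12,
    ("shore", "river"): 30,
    ("shore", "money"): 1,
}


def cooc(candidate, word):
    """Return the co-occurrence count, or 0 for an unseen pair."""
    return COOC.get((candidate, word), 0)


def nearest_neighbor(candidates, context):
    """Baseline (assumed form): only the closest context word decides."""
    nearest = context[0]  # assumption: context is ordered nearest-first
    return max(candidates, key=lambda c: cooc(c, nearest))


def contextual_word_voting(candidates, context):
    """Assumed form: every context word votes for its best candidate."""
    votes = Counter()
    for word in context:
        votes[max(candidates, key=lambda c: cooc(c, word))] += 1
    return votes.most_common(1)[0][0]


def one_best_contextual_word(candidates, context):
    """Assumed form: the single most strongly associated context word decides."""
    best_word = max(context, key=lambda w: max(cooc(c, w) for c in candidates))
    return max(candidates, key=lambda c: cooc(c, best_word))


# Candidate translations of one secondary-language word, plus the
# primary-language words of the query, nearest first.
candidates = ["bank", "shore"]
context = ["river", "money", "account"]

print(nearest_neighbor(candidates, context))
print(contextual_word_voting(candidates, context))
print(one_best_contextual_word(candidates, context))
```

In this toy query the nearest word alone favors a different candidate than the wider context does, which is the kind of case where features that look beyond the nearest neighbor could help.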
Similar resources
Query Translation Disambiguation as Graph Partitioning
Resolving ambiguity in the process of query translation is crucial to cross-language information retrieval when only a bilingual dictionary is available. In this paper we propose a novel approach for query translation disambiguation, named “spectral query translation model”. The proposed approach views the problem of query translation disambiguation as a graph partitioning problem. For a given ...
Using co-occurrence tendencies to improve Cross-Language Information Retrieval
Query disambiguation is considered one of the most important methods for improving the effectiveness of information retrieval. In the present paper, we focus on query term disambiguation via a combined statistical method applied both before and after translation, in order to avoid source language ambiguity as well as incorrect selection of target translations. By combining query expansion with dict...
Cross-Language Information Retrieval via Hybrid Combination of Query Expansion Techniques
This paper describes a new approach in Cross-Language Information Retrieval that combines query expansion techniques before and after query translation and disambiguation. Moreover, a new technique based on domain keywords extraction is proposed. Test results showed the effectiveness of the combined method.
Translation Probabilities in Cross-language Information Retrieval
Translation ambiguity is a major problem in dictionary-based cross-language information retrieval. To attack the problem, indirect disambiguation approaches, which do not explicitly resolve translation ambiguity, rely on query-structuring techniques such as a structured Boolean model and Pirkola’s method. Direct disambiguation approaches try to assign translation probabilities to translation eq...
Ambiguity of Queries and the Challenges for Query Language Detection
In this paper, a sample set of 510 simple searches from the TEL action log 2009 is analyzed for query content and query language. More than half of the queries are for named entities, which has consequences for query language disambiguation. A manual identification of query language finds that often a definite language cannot be determined, because many named entities are not translated. Proble...